Search for: All records

Creators/Authors contains: "Cai, Hengrui"

Note: Clicking a Digital Object Identifier (DOI) will take you to an external site maintained by the publisher. Some full-text articles may not be available free of charge until the publisher's embargo (administrative interval) has ended.


  1. Deep reinforcement learning (RL) has gained widespread adoption in recent years but faces significant challenges, particularly in unknown and complex environments. Among these, high-dimensional action selection stands out as a critical problem. Existing works often require sophisticated prior design to eliminate redundancy in the action space, relying heavily on domain expertise or incurring high computational complexity, which limits their generalizability across RL tasks. In this paper, we address these challenges by proposing a general, data-driven action selection approach that is model-free and computationally friendly. Our method not only selects a minimal sufficient set of actions but also controls the false discovery rate via knockoff sampling. More importantly, we seamlessly integrate the action selection into deep RL methods during online training. Empirical experiments validate the established theoretical guarantees, demonstrating that our method surpasses various alternative techniques in both variable-selection performance and overall achieved reward. (An illustrative sketch of the generic knockoff-filter idea appears after this list.)
    Free, publicly accessible full text available June 10, 2026
  2. Contextual bandits, which leverage baseline features of sequentially arriving individuals to optimize cumulative rewards while balancing exploration and exploitation, are critical for online decision-making. Existing approaches typically assume no interference, i.e., that each individual's action affects only their own reward. Yet this assumption is violated in many practical scenarios, and overlooking interference can lead to short-sighted policies that focus solely on maximizing each individual's immediate outcome, resulting in suboptimal decisions and potentially increased regret over time. To address this significant gap, we introduce the foresighted online policy with interference (FRONT), which innovatively considers the long-term impact of the current decision on subsequent decisions and rewards. (A toy interference-aware bandit sketch also appears after this list.)
    Free, publicly accessible full text available June 10, 2026
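
For readers unfamiliar with knockoff-based selection, the first record rests on the knockoff filter: each action coordinate gets a synthetic "knockoff" copy, an importance statistic compares the original against its copy, and a data-dependent threshold keeps only coordinates whose evidence survives at a target false discovery rate. The sketch below is a generic, hypothetical illustration, not the authors' algorithm: the permutation-based knockoffs are only faithful when action coordinates are roughly independent, and the correlation-difference statistic is a placeholder for whatever importance measure a deep RL agent would supply.

```python
import numpy as np

def knockoff_plus_threshold(W, q=0.1):
    """Knockoff+ threshold controlling the false discovery rate at level q.
    W holds one antisymmetric importance statistic per action coordinate
    (positive means the original coordinate beats its knockoff copy)."""
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp_hat <= q:
            return t
    return np.inf  # nothing selected

def select_action_coordinates(actions, rewards, q=0.1, seed=0):
    """Toy data-driven action selection: build crude knockoff copies,
    score coordinates by a marginal-correlation difference, and keep
    those passing the knockoff+ threshold."""
    rng = np.random.default_rng(seed)
    n, p = actions.shape
    # Column-wise permutation is a stand-in for a proper knockoff sampler;
    # it is only valid when coordinates are (close to) independent.
    knockoffs = np.column_stack([rng.permutation(actions[:, j]) for j in range(p)])
    corr = lambda X: np.abs(np.corrcoef(np.column_stack([X, rewards]), rowvar=False)[-1, :p])
    W = corr(actions) - corr(knockoffs)
    return np.flatnonzero(W >= knockoff_plus_threshold(W, q))
```

In an online deep RL loop, one would recompute W periodically from replay-buffer data and restrict the policy network to the selected action coordinates; how the paper actually couples the filter with training is described in the full text.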
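
The second record concerns contextual bandits under interference, where one individual's action shifts the outcomes of others. The toy below does not reflect the actual FRONT algorithm; it merely augments a standard LinUCB-style bandit with an interference feature (the running mean of recent actions) and a crude "foresight" bonus for the anticipated downstream effect of the current action. All class and parameter names are illustrative assumptions.

```python
import numpy as np

class InterferenceAwareLinUCB:
    """Toy linear-UCB contextual bandit whose reward model includes an
    interference feature, loosely illustrating why ignoring interference
    yields short-sighted choices. Not the FRONT algorithm."""

    def __init__(self, dim, alpha=1.0, foresight=0.5):
        d = dim + 2                      # context + own action + interference term
        self.A = np.eye(d)               # ridge Gram matrix
        self.b = np.zeros(d)
        self.alpha = alpha               # exploration width
        self.foresight = foresight       # weight on anticipated downstream effect
        self.action_history = []

    def _features(self, x, a):
        interference = np.mean(self.action_history) if self.action_history else 0.0
        return np.concatenate([x, [a, interference]])

    def act(self, x):
        theta = np.linalg.solve(self.A, self.b)
        scores = []
        for a in (0, 1):                 # binary action for simplicity
            phi = self._features(x, a)
            ucb = phi @ theta + self.alpha * np.sqrt(phi @ np.linalg.solve(self.A, phi))
            # Foresight: crude estimate of how taking `a` shifts the future
            # interference level felt by later individuals.
            future_shift = self.foresight * a * theta[-1]
            scores.append(ucb + future_shift)
        return int(np.argmax(scores))

    def update(self, x, a, r):
        phi = self._features(x, a)
        self.A += np.outer(phi, phi)
        self.b += r * phi
        self.action_history.append(a)
```

A minimal usage loop would call bandit.act(x) for each arriving individual and then bandit.update(x, a, r) with the observed reward; the foresight term is what distinguishes this from a purely myopic bandit.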